The Universality of Linearity
MATH005 Lesson 7
00:00

The Universality of Linearity is perhaps the most powerful shortcut in probability theory. It allows us to calculate the expectation of a sum of random variables by simply summing their individual expectations—regardless of whether those variables are independent, correlated, or otherwise dependent in arbitrarily complicated ways.

1. Foundations & Proposition 2.1

To understand why expectation behaves so linearly, we look at the Law of the Unconscious Statistician (LOTUS) for multivariate systems. Proposition 2.1 states that if $X$ and $Y$ have a joint probability mass function $p(x, y)$, then the expectation of any function $g(X, Y)$ is:

$$E[g(X, Y)] = \sum_{y} \sum_{x} g(x, y) p(x, y)$$

For continuous variables with joint PDF $f(x, y)$, the equivalent integral form is:

$$E[g(X, Y)] = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} g(x, y) f(x, y) dx dy$$
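The discrete form of Proposition 2.1 is a direct double sum. A minimal sketch, using a small hypothetical joint pmf (the support and probabilities below are invented for illustration):

```python
# Discrete LOTUS: E[g(X, Y)] = sum over (x, y) of g(x, y) * p(x, y).
# Hypothetical joint pmf on {0, 1} x {0, 1}; probabilities sum to 1.
joint_pmf = {
    (0, 0): 0.2, (0, 1): 0.3,
    (1, 0): 0.1, (1, 1): 0.4,
}

def expectation(g, pmf):
    """Apply Proposition 2.1: sum g(x, y) * p(x, y) over the support."""
    return sum(g(x, y) * p for (x, y), p in pmf.items())

e_sum = expectation(lambda x, y: x + y, joint_pmf)  # E[X + Y]
e_x = expectation(lambda x, y: x, joint_pmf)        # E[X] = 0.5
e_y = expectation(lambda x, y: y, joint_pmf)        # E[Y] = 0.7
```

Note that `e_sum` equals `e_x + e_y`, which is exactly the linearity result derived in the next section.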

2. The Linearity Principle

By applying LOTUS to the function $g(X, Y) = X + Y$, we derive the central theorem of this lesson: $E[X + Y] = E[X] + E[Y]$. This extends naturally to any finite collection:

$E\left[\sum_{i=1}^n X_i\right] = \sum_{i=1}^n E[X_i]$

This is "universal" because it requires no assumptions about the joint distribution. Whether variables are independent or heavily dependent, the average of the sum is the sum of the averages.
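To see this universality concretely, here is a Monte Carlo sketch in which $Y = X^2$, so $X$ and $Y$ are about as dependent as possible, yet the average of the sum still matches the sum of the averages:

```python
import random

# Monte Carlo check that E[X + Y] = E[X] + E[Y] even when X and Y are
# strongly dependent: here Y = X**2 is a deterministic function of X.
random.seed(0)
n = 100_000
xs = [random.uniform(0, 1) for _ in range(n)]
ys = [x * x for x in xs]                      # heavily dependent on X

mean = lambda v: sum(v) / len(v)
lhs = mean([x + y for x, y in zip(xs, ys)])   # estimate of E[X + Y]
rhs = mean(xs) + mean(ys)                     # E[X] + E[Y] ~ 1/2 + 1/3
```

The two quantities agree to floating-point precision, because the sample version of linearity is just the distributive law of arithmetic; no independence is ever used.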

Example 2a: The Ambulance Problem

Consider an accident at location $X$ on a road of length $L$ and an ambulance at $Y$, where $X, Y \sim U(0, L)$ and are independent. Using the multivariate LOTUS to find $E[|X-Y|]$:

The joint PDF is $f(x, y) = 1/L^2$ for $0 \le x, y \le L$.

$$E[|X-Y|] = \int_0^L \int_0^L |x-y| \frac{1}{L^2} dx dy = \frac{L}{3}$$
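The double integral above can be sanity-checked by simulation. A sketch with $L = 1$ for concreteness:

```python
import random

# Monte Carlo check of the ambulance result E[|X - Y|] = L / 3 for
# independent X, Y ~ Uniform(0, L); taking L = 1 for simplicity.
random.seed(0)
L, n = 1.0, 200_000
est = sum(abs(random.uniform(0, L) - random.uniform(0, L))
          for _ in range(n)) / n              # estimate of E[|X - Y|]
```

The estimate lands close to $1/3$, matching the exact value $L/3$.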

3. Monotonicity and Bounds

Expectation preserves the order of random variables. If $X \ge Y$ for all outcomes, then $E[X] \ge E[Y]$. This follows from the fact that a nonnegative random variable has nonnegative expectation (Example 2b): since $X - Y \ge 0$, we have $E[X - Y] \ge 0$, and linearity gives $E[X] - E[Y] = E[X - Y] \ge 0$. Furthermore, if a variable is bounded such that $P\{a \le X \le b\} = 1$, then it follows that $a \le E[X] \le b$.
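The monotonicity argument can be written as a one-line chain, combining linearity with the nonnegativity of expectation:

$$X - Y \ge 0 \;\Rightarrow\; 0 \le E[X - Y] = E[X] - E[Y] \;\Rightarrow\; E[X] \ge E[Y]$$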

4. The Sample Mean (Example 2c)

Let $X_1, \dots, X_n$ be a sample from a distribution with mean $\mu$. The sample mean is defined as:

$$\bar{X} = \sum_{i=1}^{n} \frac{X_i}{n}$$

Due to linearity, $E[\bar{X}] = \frac{1}{n} \sum E[X_i] = \frac{n\mu}{n} = \mu$. The expected value of the sample mean is $\mu$, proving it is an unbiased estimator.
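A simulation sketch of unbiasedness (the Exponential distribution and parameters below are an illustrative choice, not from the lesson): draw many samples of size $n$, compute each sample mean, and check that their grand average sits near $\mu$.

```python
import random

# Unbiasedness of the sample mean: sample repeatedly from an
# Exponential(rate = 2) distribution, whose mean is mu = 0.5, and
# average the resulting sample means to estimate E[X-bar].
random.seed(0)
n, trials, mu = 10, 20_000, 0.5
sample_means = [
    sum(random.expovariate(2.0) for _ in range(n)) / n
    for _ in range(trials)
]
grand_avg = sum(sample_means) / trials        # estimates E[X-bar] = mu
```

Each individual sample mean fluctuates around $\mu$, but their average converges to $\mu$, exactly as $E[\bar{X}] = \mu$ predicts.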

⚠️ The Infinite Caveat
When one is dealing with an infinite collection of random variables $X_i, i \ge 1$, it is not necessarily true that $E[\sum_{i=1}^\infty X_i] = \sum_{i=1}^\infty E[X_i]$. The interchange of expectation and infinite sum is justified if either of the following holds:
  1. The $X_i$ are all nonnegative random variables.
  2. The series is absolutely convergent: $\sum_{i=1}^\infty E[|X_i|] < \infty$.